Abstract

This paper presents self-contained proofs of the strong subadditivity
inequality for von Neumann's quantum entropy, $S(\rho)$, and some related
inequalities for the quantum relative entropy, most notably its convexity and
its monotonicity under stochastic maps. Moreover, the approach presented
here, which is based on Klein's inequality and Lieb's theorem that the
function $A \mapsto \mathrm{Tr}\, e^{K + \log A}$ is concave, allows one to obtain conditions for
equality. In the case of strong subadditivity, which states that
$S(\rho_{123}) + S(\rho_2) \le S(\rho_{12}) + S(\rho_{23})$ where the subscripts denote subsystems of a composite
system, equality holds if and only if $\log \rho_{123} = \log \rho_{12} - \log \rho_2 + \log \rho_{23}$.
Using the fact that the Holevo bound on the accessible information in a
quantum ensemble can be obtained as a consequence of the monotonicity
of relative entropy, we show that equality can be attained for that bound
only when the states in the ensemble commute. The paper concludes with
an Appendix giving a short description of Epstein's elegant proof of Lieb's
theorem.

1 Introduction



1.1 Quantum Entropy 



Quantum information science is the study of the information carrying and
processing properties of quantum mechanical systems. Recent work in this area
has generated renewed interest in the properties of the quantum mechanical
entropy. It is interesting to note that von Neumann [?, ?] introduced the notion of
mixed state, represented by a density matrix $\rho$ (a positive semi-definite operator
with $\mathrm{Tr}\,\rho = 1$), into quantum theory and defined its entropy as $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$
in 1927, well before the corresponding classical quantity was introduced in
Shannon's seminal work on "The Mathematical Theory of Communication" in
1948. (Admittedly, von Neumann's motivation was the extension of the classical
theory of statistical mechanics, developed by Gibbs, Boltzmann, et al., to the
quantum domain rather than the development of a theory of quantum
communication.) Many fundamental properties of the quantum entropy were proved in
a remarkable, but little-known, 1936 paper of Delbrück and Molière. For
further discussion of the history of quantum entropy, see [?, ?, ?] and the
introductory remarks in [?].

One important class of inequalities relates the entropy of subsystems to that
of a composite system, whose Hilbert space is a tensor product $\mathcal{H}_{12} = \mathcal{H}_1 \otimes \mathcal{H}_2$
of the Hilbert spaces for the subsystems. When the state of the composite system
is described by the density matrix $\rho_{12}$, the states of the subsystems are given by
the reduced density matrices, e.g., $\rho_1 = T_2(\rho_{12})$, obtained by taking the partial
trace. The subadditivity inequality

$S(\rho_{12}) \le S(\rho_1) + S(\rho_2)$    (1)

was proved in [?] and [?]. (It should not be confused with the concavity

$S(x\rho' + (1 - x)\rho'') \ge x S(\rho') + (1 - x) S(\rho'')$    (2)

which can actually be obtained from subadditivity by considering block matrices
[?, 28, ?].) In the more complex situation in which the composite system is
composed of three subsystems the following stronger inequality, known as strong
subadditivity (SSA), holds.

$S(\rho_{123}) + S(\rho_2) \le S(\rho_{12}) + S(\rho_{23})$    (3)



This inequality was conjectured by Lanford and Robinson in [24] and proved in
[?, 28]. In this paper, we review its proof in a form that easily yields the following
condition for equality.

Theorem 1 Equality holds in strong subadditivity (3) if and only if

$\log \rho_{123} - \log \rho_{12} = \log \rho_{23} - \log \rho_2.$    (4)

We have suppressed implicit tensor products with the identity so that, e.g., $\log \rho_{12}$
means $(\log \rho_{12}) \otimes I_3$. Rewriting (4) as $\log \rho_{123} + \log \rho_2 = \log \rho_{12} + \log \rho_{23}$,
multiplying by $\rho_{123}$ and taking the trace immediately establishes the sufficiency of this
equality condition. In Section 4, we will show that it is also necessary.
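The inequality (3) and the trace identity behind the sufficiency argument can be checked numerically. The following is only an illustrative sketch, assuming NumPy, three qubits, and a randomly generated strictly positive state; the helper names `random_density`, `partial_trace`, `logm_h` and `entropy` are hypothetical and not part of the paper.

```python
import numpy as np

def random_density(dim, rng):
    """Random strictly positive density matrix (hypothetical test-state generator)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    rho = rho / np.trace(rho).real
    return 0.9 * rho + 0.1 * np.eye(dim) / dim  # keep eigenvalues away from zero

def partial_trace(rho, dims, keep):
    """Reduced density matrix on the subsystems listed (in increasing order) in keep."""
    n = len(dims)
    t = rho.reshape(list(dims) + list(dims))
    # Trace out the highest-numbered subsystem first so axis numbers stay valid.
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=i, axis2=i + t.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def logm_h(a):
    """Logarithm of a strictly positive Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def entropy(rho):
    """von Neumann entropy S(rho) = -Tr rho log rho (natural logarithm)."""
    return float(-np.trace(rho @ logm_h(rho)).real)

rng = np.random.default_rng(0)
dims = [2, 2, 2]
I2 = np.eye(2)
rho123 = random_density(8, rng)
rho12 = partial_trace(rho123, dims, [0, 1])
rho23 = partial_trace(rho123, dims, [1, 2])
rho2 = partial_trace(rho123, dims, [1])

# Gap in (3): S(rho12) + S(rho23) - S(rho123) - S(rho2) >= 0
ssa_gap = entropy(rho12) + entropy(rho23) - entropy(rho123) - entropy(rho2)

# The same gap as Tr rho123 (log rho123 - log rho12 + log rho2 - log rho23),
# with the suppressed tensor products with the identity written out.
op = (logm_h(rho123) - np.kron(logm_h(rho12), I2)
      + np.kron(np.kron(I2, logm_h(rho2)), I2) - np.kron(I2, logm_h(rho23)))
trace_form = float(np.trace(rho123 @ op).real)
print(ssa_gap, trace_form)
```

When condition (4) holds, the operator `op` vanishes, so the gap is zero; this is exactly the sufficiency argument given above.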



1.2 Relative entropy 

The SSA inequality can be restated as a property of the quantum relative entropy
which is defined as

$H(\rho, \gamma) = \mathrm{Tr}\, \rho (\log \rho - \log \gamma).$    (5)

It is usually assumed that $\rho, \gamma$ are density matrices, although (5) is well-defined
for any pair of positive semi-definite matrices for which $\ker(\gamma) \subset \ker(\rho)$. Strong
subadditivity can now be restated as

$H(\rho_{12}, \rho_2) \le H(\rho_{123}, \rho_{23})$    (6)

where we again write, e.g., $\rho_{23}$ for $I_1 \otimes \rho_{23}$. More generally, the relative entropy
is monotone under completely positive, trace-preserving maps (also known as
"quantum operations" [?] and "stochastic maps" [18] and discussed in more
detail in Section 3.4), i.e.,

$H[\Phi(\rho), \Phi(\gamma)] \le H(\rho, \gamma).$    (7)

This monotonicity implies (6) when $\Phi = T_3$ is the partial trace operation; perhaps
surprisingly, the converse is also true [?]. This, and other connections between
strong subadditivity and relative entropy, are discussed in Section 5.

The approach to SSA presented here can also be used to obtain conditions
for equality in properties of relative entropy, including its joint convexity and
monotonicity. The explicit statements are postponed to later sections. Since the
monotonicity can be used to give a simple proof of the celebrated Holevo bound
[?, ?] on accessible information, we show how our results can be used to recover
the equality conditions in that bound. As discussed in Section 2.3, Petz [?, ?]
has also obtained several equality conditions in different, but equivalent, forms.
However, Theorem 8, which applies to the most general form of monotonicity,
appears to be new.

1.3 Lieb's convex trace functions

One of the most frequently cited approaches to strong subadditivity is to present
it as a consequence of the concavity of a quantity known as the
Wigner-Yanase-Dyson entropy [49]. This property, conjectured by Bauman [?], is equivalent to
the joint concavity in $A$ and $B$ of the map

$(A, B) \mapsto \mathrm{Tr}\, A^{s} K^\dagger B^{1-s} K$ for $A, B > 0$, $0 < s < 1$    (8)

(where $\dagger$ is used to denote the adjoint). Lieb's proof [?] of the concavity of the
WYD function (8) and his realization of a connection between SSA and
Bauman's concavity conjecture was a crucial breakthrough. However, concavity of
the WYD function was only one of several concave trace functions studied in [25];
the following result was also established by Lieb.

Theorem 2 For any fixed self-adjoint matrix $K$, the function
$A \mapsto F(A) = \mathrm{Tr}\, e^{K + \log A}$ is concave in $A > 0$.

This result played a fundamental role in the original proof [?, ?] of SSA and
the closely related property of joint concavity of the relative entropy [23, ?].

Although SSA is a deep theorem, a complete proof is not as forbidding as is
sometimes implied. Therefore, for completeness, we include Epstein's elegant
proof [?] of Theorem 2 in Appendix A, and then follow the original strategies of
Lieb and Ruskai to show how it implies SSA.



1.4 Overview 

Although this paper grew out of questions about the conditions for equality in 
strong subadditivity and related inequalities, it seems useful to present these 
conditions within a more comprehensive exposition. For simplicity, we confine 
our discussion to finite dimensions, and assume that, unless otherwise stated, the 
density matrices under consideration are strictly positive. 

The remainder of the paper is structured as follows. In Section 2 we
discuss some consequences and interpretations of the SSA equality condition. In
Section 3 we summarize some mathematical results needed for the proofs in the
sections that follow. Section 4, which might be regarded as the heart of the paper,
presents the proof of strong subadditivity in a form which easily yields the
equality conditions. (A reader primarily interested in this proof can proceed directly to
Section 4 with a willingness to accept the results of Section 3.) Section 5 presents
proofs with equality conditions for the monotonicity of the relative entropy under
partial traces, the joint convexity of the relative entropy, and the general
monotonicity under stochastic maps. This section also contains a discussion of the
connection between these properties, SSA and their proofs. Section 6 contains
the proof of the equality conditions for monotonicity of relative entropy. Section 7
considers bounds, most notably the Holevo bound, on the accessible information
that can be extracted from an ensemble of quantum states, and the conditions
under which they can be attained. The paper concludes with some additional
historical comments in Section 8.

2 Implications of the equality conditions for SSA 
2.1 Classical conditions 

To describe the corresponding classical inequalities, let the subsystems A, B and
C correspond to classical random variables. One can recover the classical
Shannon entropy $-\sum_a p(a) \log p(a)$ from the von Neumann entropy by taking $\rho$ to
be a diagonal matrix with elements $p(a)$ on the diagonal. Employing a slight
abuse of notation, we write $S[p(a)]$ for this quantity. Then the classical strong
subadditivity inequality can be stated as

$S[p(a, b, c)] + S[p(b)] \le S[p(a, b)] + S[p(b, c)].$    (9)

The classical relative entropy of the distribution $q(a)$ with respect to $p(a)$ is
$H[p(a), q(a)] = \sum_a p(a) \log \frac{p(a)}{q(a)}$. It is well-known (see, e.g., [?]) that the
convexity of the function $f(x) = x \log x$ implies that $H[p(a), q(a)] \ge 0$ and its strict
convexity implies that equality holds if and only if $p(a) = q(a)$ $\forall\, a$. (The
generalization of this result to quantum situations is discussed in Section 3.1.)

The classical form (9) of SSA is equivalent to $H[p(a, b, c), q(a, b, c)] \ge 0$ when
the second distribution is $q(a, b, c) = p(a, b)[p(b)]^{-1} p(b, c)$. Thus, equality holds in
(9) if and only if

$p(a, b, c) = p(a, b)[p(b)]^{-1} p(b, c) \quad \forall\, a, b, c$    (10)

which can be rewritten as

$\log p(a, b, c) - \log p(a, b) = \log p(b, c) - \log p(b) \quad \forall\, a, b, c,$    (11)

which is identical to what one would obtain from Theorem 1. Using $p(c|b)$ to
denote the classical conditional probability distribution, (11) can be rewritten as

$p(c|a, b) = p(c|b),$    (12)

which is precisely the condition that the sequence $A \to B \to C$ forms a Markov
chain.
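The classical statement can be illustrated with a small computation. The sketch below uses only the Python standard library; the transition probabilities and the generic comparison distribution are made-up numbers chosen for illustration, not data from the paper. It builds a Markov chain $A \to B \to C$, confirms that (10) holds and that the gap in (9) vanishes, and contrasts this with a distribution that is not a Markov chain.

```python
import math
from itertools import product

def marginal(p, keep):
    """Marginal of a joint distribution stored as {(a, b, c): probability}."""
    out = {}
    for key, val in p.items():
        k = tuple(key[i] for i in keep)
        out[k] = out.get(k, 0.0) + val
    return out

def shannon(p):
    """Shannon entropy -sum p log p (natural logarithm)."""
    return -sum(v * math.log(v) for v in p.values() if v > 0)

def ssa_gap(p_abc):
    """S[p(a,b)] + S[p(b,c)] - S[p(a,b,c)] - S[p(b)], non-negative by (9)."""
    return (shannon(marginal(p_abc, [0, 1])) + shannon(marginal(p_abc, [1, 2]))
            - shannon(p_abc) - shannon(marginal(p_abc, [1])))

# Hypothetical Markov chain A -> B -> C on binary variables.
p_a = [0.3, 0.7]
p_b_given_a = [[0.9, 0.1], [0.2, 0.8]]    # indexed [a][b]
p_c_given_b = [[0.6, 0.4], [0.25, 0.75]]  # indexed [b][c]
markov = {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
          for a, b, c in product(range(2), repeat=3)}

# Condition (10): p(a,b,c) = p(a,b) p(b,c) / p(b) for all a, b, c.
p_ab, p_bc, p_b = marginal(markov, [0, 1]), marginal(markov, [1, 2]), marginal(markov, [1])
factorizes = all(
    math.isclose(markov[a, b, c], p_ab[a, b] * p_bc[b, c] / p_b[(b,)])
    for a, b, c in markov
)
gap_markov = ssa_gap(markov)

# A generic distribution that does not satisfy (10), for comparison.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
generic = {key: w / sum(weights)
           for key, w in zip(product(range(2), repeat=3), weights)}
gap_generic = ssa_gap(generic)
print(factorizes, gap_markov, gap_generic)
```

The gap computed here is the classical conditional mutual information of A and C given B, which is why it vanishes exactly for Markov chains.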
2.2 Special cases of SSA equality



Some insight into equality condition (4) may be obtained by looking at special
cases in which it is satisfied. The most obvious is when $\rho_{123}$ is a tensor product
of its three reduced density matrices. However, it is readily verified that (4) also
holds when either $\rho_{123} = \rho_1 \otimes \rho_{23}$ or $\rho_{123} = \rho_{12} \otimes \rho_3$. One can generalize this slightly
further. If the subsystem 2 can be partitioned further into two subsystems 2' and
2'', then one can verify equality holds if $\rho_{123} = \rho_{12'} \otimes \rho_{2''3}$, where $\rho_{12'}$ and $\rho_{2''3}$ are
states of the composite systems 1, 2' and 2'', 3 respectively.
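As a quick numerical illustration of the first special case, the sketch below checks the operator identity (4) for a state of the form $\rho_1 \otimes \rho_{23}$ on three qubits. It assumes NumPy and randomly generated strictly positive states; the helper names are hypothetical.

```python
import numpy as np

def random_density(dim, rng):
    """Random strictly positive density matrix (hypothetical test-state generator)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    rho = rho / np.trace(rho).real
    return 0.9 * rho + 0.1 * np.eye(dim) / dim  # keep eigenvalues away from zero

def logm_h(a):
    """Logarithm of a strictly positive Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

rng = np.random.default_rng(1)
I2 = np.eye(2)
rho1 = random_density(2, rng)
rho23 = random_density(4, rng)
rho123 = np.kron(rho1, rho23)

# Reduced states of the product state: rho2 = Tr_3 rho23 and rho12 = rho1 (x) rho2.
rho2 = np.trace(rho23.reshape(2, 2, 2, 2), axis1=1, axis2=3)
rho12 = np.kron(rho1, rho2)

# Condition (4) with the suppressed identity factors written out.
lhs = logm_h(rho123) - np.kron(logm_h(rho12), I2)
rhs = np.kron(I2, logm_h(rho23)) - np.kron(np.kron(I2, logm_h(rho2)), I2)
condition_holds = np.allclose(lhs, rhs, atol=1e-8)
print(condition_holds)
```

Both sides reduce to $I_1 \otimes (\log \rho_{23} - \log \rho_2 \otimes I_3)$ here, because the logarithm of a tensor product splits into a sum.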

However, such a decomposition into tensor products is not necessary; indeed,
we have already seen that equality also holds for the case of classical Markov
processes. Moreover, by comparison to (12) it is natural to regard (4) as a kind
of quantum Markov condition. Thus, the conditions in Theorem 1 can also be
viewed as a natural non-commutative analogue of the conditions for equality in
classical SSA. Another way of regarding (4) is as a concise statement of a subtle
intertwining condition discussed below. Unfortunately, we have not found explicit
examples which satisfy it other than the two classes discussed above, that is, a
partial decomposition into tensor products or a classical Markov chain.



2.3 Petz's conditions 

Using a completely different approach, Petz [?] gave conditions for equality
in (7) when $\Phi$ can be identified with a mapping of an algebra onto a subalgebra,
a situation which includes (6). In that case Petz's conditions become

$\rho_{12}^{it} \rho_2^{-it} = \rho_{123}^{it} \rho_{23}^{-it}.$    (13)

Taking the derivative of both sides of (13) at $t = 0$ yields (4). Although (13)
appears stronger than (4), it is not since, as noted above, (4) is sufficient for
equality in (3). Moreover, since (4) implies

$e^{it \log \rho_{123}} = e^{it[\log \rho_{12} - \log \rho_2 + \log \rho_{23}]}$    (14)

our results can be combined with those of Petz to see that equality holds in SSA
$\Longleftrightarrow$ (4) $\Longleftrightarrow$ (13) and that any of these conditions suffices to imply

$e^{it[\log \rho_{12} - \log \rho_2 + \log \rho_{23}]} = e^{it \log \rho_{12}} e^{-it \log \rho_2} e^{it \log \rho_{23}}.$    (15)

Note that one can also relate Petz's conditions to those for equality in classical
SSA by rewriting (10) as $p(a, b, c)[p(b, c)]^{-1} = p(a, b)[p(b)]^{-1}$ and then raising to
the $it$ power.

3 Fundamental mathematical tools

3.1 Klein's inequality

The fact that the relative entropy is positive, i.e., $H(\rho, \gamma) \ge 0$ when $\mathrm{Tr}\,\rho = \mathrm{Tr}\,\gamma$,
is an immediate consequence of the following fundamental convexity result due
to Klein [?, ?, ?].

Theorem 3 (Klein's Inequality) For $A, B > 0$

$\mathrm{Tr}\, A(\log A - \log B) \ge \mathrm{Tr}(A - B),$    (16)

with equality if and only if $A = B$.

The closely related Peierls-Bogoliubov inequality [?, ?] is sometimes used
instead of Klein's inequality. However, the equality conditions in Theorem 3 play
a critical role in the sections that follow.
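Klein's inequality (16) is easy to probe numerically. The following sketch, assuming NumPy and randomly generated positive matrices that need not have unit trace, evaluates both sides of (16) and also the trivial equality case $A = B$; all helper names are hypothetical.

```python
import numpy as np

def random_positive(dim, rng):
    """Random strictly positive matrix, not normalized (hypothetical generator)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T + 0.1 * np.eye(dim)

def logm_h(a):
    """Logarithm of a strictly positive Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def klein_sides(a, b):
    """Return (Tr A(log A - log B), Tr(A - B)), the two sides of (16)."""
    lhs = float(np.trace(a @ (logm_h(a) - logm_h(b))).real)
    rhs = float(np.trace(a - b).real)
    return lhs, rhs

rng = np.random.default_rng(2)
A = random_positive(4, rng)
B = random_positive(4, rng)

lhs, rhs = klein_sides(A, B)       # expect lhs >= rhs
lhs_eq, rhs_eq = klein_sides(A, A) # equality case A = B: both sides vanish
print(lhs, rhs, lhs_eq, rhs_eq)
```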



3.2 Lieb's golden corollary 

The proofs in Section 4 do not use Theorem 2 directly, but a related result
generalizing the following inequality, which we will also need.

Theorem 4 (Golden-Thompson-Symanzik) For self-adjoint matrices $A$ and $B$
$\mathrm{Tr}\, e^{A+B} \le \mathrm{Tr}\, e^{A} e^{B}$ with equality if and only if $A$ and $B$ commute.

Although this inequality is extremely well-known, the conditions for equality do
not appear explicitly in such standard references as [16, 42, 47]. However, one
method of proof is based on the observation that $\mathrm{Tr}\,[e^{A/2^k} e^{B/2^k}]^{2^k}$ is monotone
decreasing in $k$, yielding $\mathrm{Tr}\, e^{A+B}$ in the limit as $k \to \infty$. The equality conditions
then follow easily from those for the Schwarz inequality for the Hilbert-Schmidt
inner product $\mathrm{Tr}\, C^\dagger D$. Indeed, $k = 1$ yields

$\mathrm{Tr}\,(e^{A/2} e^{B/2})(e^{A/2} e^{B/2}) \le \left[\mathrm{Tr}\, e^{A/2} e^{B} e^{A/2}\right]^{1/2} \left[\mathrm{Tr}\, e^{B/2} e^{A} e^{B/2}\right]^{1/2} = \mathrm{Tr}\, e^{A} e^{B}$

with $C = e^{B/2} e^{A/2}$ and $D = e^{A/2} e^{B/2}$. The equality condition that $C$ is a multiple
of $D$ implies $e^{B/2} e^{A/2} = e^{A/2} e^{B/2}$ which holds if and only if $A$ and $B$ commute.
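A small numerical experiment illustrates both the inequality of Theorem 4 and its equality case. This is a sketch under stated assumptions: NumPy, random real symmetric matrices for the generic case, and diagonal matrices as the commuting case; `expm_h` is a hypothetical helper that exponentiates a Hermitian matrix through its eigendecomposition.

```python
import numpy as np

def expm_h(a):
    """Exponential of a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.exp(w)) @ v.conj().T

def golden_thompson_sides(a, b):
    """Return (Tr e^{A+B}, Tr e^A e^B)."""
    lhs = float(np.trace(expm_h(a + b)).real)
    rhs = float(np.trace(expm_h(a) @ expm_h(b)).real)
    return lhs, rhs

rng = np.random.default_rng(3)
g1 = rng.normal(size=(3, 3))
g2 = rng.normal(size=(3, 3))
A = (g1 + g1.T) / 2   # random symmetric matrices, generically non-commuting
B = (g2 + g2.T) / 2
lhs, rhs = golden_thompson_sides(A, B)       # expect lhs <= rhs

# Commuting case: diagonal matrices, where equality holds.
Ad = np.diag([0.3, -1.0, 2.0])
Bd = np.diag([1.5, 0.2, -0.7])
lhs_c, rhs_c = golden_thompson_sides(Ad, Bd)
print(lhs, rhs, lhs_c, rhs_c)
```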
One reference [?] that does discuss equality does so by making the interesting
observation that (as shown in [?]) Theorem 4 and its equality conditions can be
derived as a consequence of the monotonicity of relative entropy.
The natural extension to three matrices, $\mathrm{Tr}\, e^{A+B+C} \le |\mathrm{Tr}\, e^{A} e^{B} e^{C}|$, fails; see,
for example, Problem 20 on pages 512-513 of [?]. Therefore, the following result
of Lieb [25] is particularly noteworthy.

Theorem 5 (Lieb) For any $R, S, T > 0$

$\mathrm{Tr}\, e^{\log R - \log S + \log T} \le \mathrm{Tr} \int_0^\infty R \frac{1}{S + uI} T \frac{1}{S + uI}\, du.$    (17)

One might expect that equality holds if and only if $R, S, T$ commute. Although
this is sufficient, it is not necessary. One easily checks that both sides of (17)
equal $\mathrm{Tr}\, \rho_1 \otimes \rho_{23}$ when $R = \rho_1 \otimes \rho_2 \otimes I_3$, $S = I_1 \otimes \rho_2 \otimes I_3$, $T = I_1 \otimes \rho_{23}$, even when
$T$ does not commute with $R$ or $S$.

Proof: Lieb's proof of (17) begins with the easily-established fact that if
$F(A)$ is concave and homogeneous in the sense $F(xA) = xF(A)$, then

$\lim_{x \to 0} \frac{F(A + xB) - F(A)}{x} \ge F(B).$    (18)

Applying this to the function in Theorem 2 with $A = S$, $B = T$, $K = \log R - \log S$
yields

$\mathrm{Tr}\, e^{\log R - \log S + \log T} \le \lim_{x \to 0} \frac{\mathrm{Tr}\, e^{\log R - \log S + \log(S + xT)} - \mathrm{Tr}\, R}{x}.$    (19)

To complete the proof, we need the well-known integral representation

$\log(S + xT) - \log S = \int_0^\infty \frac{1}{S + uI}\, xT\, \frac{1}{S + xT + uI}\, du.$    (20)

Substituting (20) into (19) and noting that

$\mathrm{Tr}\, e^{\log R + x \int_0^\infty \frac{1}{S + uI} T \frac{1}{S + xT + uI} du} = \mathrm{Tr}\, R + x\, \mathrm{Tr}\, R \int_0^\infty \frac{1}{S + uI} T \frac{1}{S + uI}\, du + O(x^2)$

yields the desired result. QED



3.3 Purification 

Araki and Lieb [?, ?] observed that one could obtain useful new entropy
inequalities by applying what is now known as the "purification process" to known
inequalities. Any density matrix $\rho_1$ can be extended to a pure state density matrix $\rho_{12}$
on a tensor product space; moreover, $S(\rho_1) = S(\rho_2)$. Applying this to the
subadditivity inequality (1), i.e., $S(\rho_{12}) \le S(\rho_1) + S(\rho_2)$, yields the equivalent result
$S(\rho_3) \le S(\rho_{23}) + S(\rho_2)$ which can be combined with (1) to give the triangle
inequality [?, ?]

$|S(\rho_1) - S(\rho_2)| \le S(\rho_{12}) \le S(\rho_1) + S(\rho_2).$    (21)

By purifying $\rho_{123}$ to $\rho_{1234}$ one can similarly show that SSA (3) is equivalent to

$S(\rho_4) + S(\rho_2) \le S(\rho_{12}) + S(\rho_{14}).$    (22)
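The key fact used in the purification process, that the two reduced states of a pure bipartite state have equal entropy, can be checked directly. The sketch below assumes NumPy and a random pure state on $\mathbb{C}^2 \otimes \mathbb{C}^3$; storing the state vector as a $2 \times 3$ coefficient matrix `M` is a convenience chosen here, not something taken from the paper.

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy, ignoring numerically zero eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

rng = np.random.default_rng(4)
# Random pure state |psi> = sum_{ab} M[a, b] |a>|b> on C^2 (x) C^3.
M = rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))
M = M / np.linalg.norm(M)

psi = M.reshape(-1)
rho12 = np.outer(psi, psi.conj())   # pure state density matrix
rho1 = M @ M.conj().T               # partial trace over subsystem 2
rho2 = M.T @ M.conj()               # partial trace over subsystem 1

S12, S1, S2 = entropy(rho12), entropy(rho1), entropy(rho2)
print(S12, S1, S2)  # S12 is (numerically) zero and S1 equals S2
```

The equality of the two entropies reflects the fact that $M M^\dagger$ and $M^\dagger M$ share their non-zero eigenvalues.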
3.4 Lindblad's representation of stochastic maps



Stochastic maps arise naturally in quantum information as a description of the
effect on a subsystem A interacting with the environment in the pure state $\gamma_B =
|\psi_B\rangle\langle\psi_B|$ via the unitary operation $U_{AB}$,

$\rho_A \to \mathrm{Tr}_B \left( U_{AB}\, \rho_A \otimes \gamma_B\, U_{AB}^\dagger \right).$    (23)

Lindblad [?] used Stinespring's representation to show that any completely
positive trace-preserving map $\Phi$ which maps an algebra into itself can be represented
as if it arose in this way. That is, given such a map $\Phi$ one can always find an
auxiliary system, $\mathcal{H}_B$, a density matrix $\gamma_B$ on $\mathcal{H}_B$, and a unitary map $U_{AB}$ on the
combined system $\mathcal{H}_A \otimes \mathcal{H}_B$ (where A denotes the original system) such that

$\Phi(\rho) = \mathrm{Tr}_B \left( U_{AB}\, \rho \otimes \gamma_B\, U_{AB}^\dagger \right)$    (24)

where $\mathrm{Tr}_B$ denotes the partial trace over the auxiliary system.

Using the Kraus representation $\Phi(\rho) = \sum_k F_k \rho F_k^\dagger$ (and noting that the
requirement that $\Phi$ be trace-preserving is equivalent to $\sum_k F_k^\dagger F_k = I$), one can give
a construction equivalent to Lindblad's by initially defining $U_{AB}$ as

$U_{AB}\, |\psi\rangle \otimes |\beta\rangle = \sum_k F_k |\psi\rangle \otimes |k\rangle,$    (25)

where $|\beta\rangle$ is a fixed normalized state of the auxiliary system, and $\{|k\rangle\}$ is some
orthonormal basis for the auxiliary system. Then $U_{AB}$ is a partial isometry from
$\mathcal{H}_A \otimes |\beta\rangle\langle\beta|$ to $\mathcal{H}_A \otimes \mathcal{H}_B$ which can be extended to a unitary operator on all of
$\mathcal{H}_A \otimes \mathcal{H}_B$. This yields (24) with $\gamma_B = |\beta\rangle\langle\beta|$ a pure state.

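However, $U_{AB}$ can also be extended to $\mathcal{H}_A \otimes \mathcal{H}_B$ in other ways. In particular,
it can be extended, instead, to the partial isometry for which $U_{AB}^\dagger U_{AB}$ is the
projection onto $\mathcal{H}_A \otimes |\beta\rangle\langle\beta|$ so that $U_{AB} = 0$ on the orthogonal complement of
$\mathcal{H}_A \otimes |\beta\rangle$. We describe this in more detail when $\Phi$ requires at most $m$ Kraus
operators $F_k$, in which case one can choose the auxiliary system to be $\mathbb{C}^m$. One
can also choose $|k\rangle = |e_k\rangle$, and $|\beta\rangle = |e_1\rangle$ with $|e_k\rangle$ the standard basis of column
vectors with elements $c_j = \delta_{jk}$. Then (25) depends only on the first column of
$U_{AB}$ which we denote $V$ and regard as a map from $\mathcal{H}$ to $\mathcal{H} \otimes \mathbb{C}^m$. In block form

$V \rho V^\dagger = U_{AB}\, \rho \otimes |e_1\rangle\langle e_1|\, U_{AB}^\dagger$    (26)

$= \begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_m \end{pmatrix} \rho \begin{pmatrix} F_1^\dagger & F_2^\dagger & \cdots & F_m^\dagger \end{pmatrix} = \begin{pmatrix} F_1 \rho F_1^\dagger & F_1 \rho F_2^\dagger & \cdots & F_1 \rho F_m^\dagger \\ F_2 \rho F_1^\dagger & F_2 \rho F_2^\dagger & \cdots & F_2 \rho F_m^\dagger \\ \vdots & & & \vdots \\ F_m \rho F_1^\dagger & F_m \rho F_2^\dagger & \cdots & F_m \rho F_m^\dagger \end{pmatrix}$

from which it easily follows that $\mathrm{Tr}_B (V \rho V^\dagger) = \sum_k F_k \rho F_k^\dagger = \Phi(\rho)$. The
requirement that $\Phi$ be trace-preserving gives $V^\dagger V = \sum_k F_k^\dagger F_k = I$ which again implies
that $V$ is a partial isometry. Moreover, $V \rho V^\dagger$ has the same non-zero eigenvalues
as $(V \sqrt{\rho})^\dagger (V \sqrt{\rho}) = \rho$ so that $S[V \rho V^\dagger] = S(\rho)$.

This construction can be readily extended to situations in which $\Phi$ maps
operators acting on one Hilbert space $\mathcal{H}_A$ to those acting on another space $\mathcal{H}_{A'}$,
e.g., $\Phi : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_{A'})$. In this case, the Kraus operators $F_k : \mathcal{H}_A \to \mathcal{H}_{A'}$,
and $U_{AB}$ is a partial isometry from $\mathcal{H}_A \otimes |\beta\rangle\langle\beta|$ to a subspace of $\mathcal{H}_{A'} \otimes \mathcal{H}_B$.
Alternatively, $V$ can be defined as a partial isometry from $\mathcal{H}_A$ to $\mathcal{H}_{A'} \otimes \mathbb{C}^m$.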
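The construction of $V$ from a set of Kraus operators can be made concrete. The sketch below is only an illustration: it assumes NumPy, uses the qubit amplitude-damping channel with a made-up damping parameter as the example map, and orders the tensor factors as $\mathcal{H} \otimes \mathbb{C}^m$ (system first, auxiliary second), which is a choice made here for the code.

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy, ignoring numerically zero eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

# Example Kraus operators (assumption): amplitude damping with parameter g.
g = 0.3
F = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]]),
     np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]
m = len(F)

# V = sum_k F_k (x) |e_k>, a map from H to H (x) C^m.
V = sum(np.kron(Fk, np.eye(m)[:, [k]]) for k, Fk in enumerate(F))

rho = np.array([[0.6, 0.2], [0.2, 0.4]])
big = V @ rho @ V.conj().T                                 # V rho V^dagger
reduced = np.trace(big.reshape(2, m, 2, m), axis1=1, axis2=3)  # trace over C^m
phi_rho = sum(Fk @ rho @ Fk.conj().T for Fk in F)          # Kraus form of Phi(rho)

is_isometry = np.allclose(V.conj().T @ V, np.eye(2))
matches_channel = np.allclose(reduced, phi_rho)
same_entropy = abs(entropy(big) - entropy(rho)) < 1e-9
print(is_isometry, matches_channel, same_entropy)
```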



3.5 Measurements and their representations 

A von Neumann or projective measurement is a partition of the identity $I = \sum_b E_b$
into mutually orthogonal projections, i.e., $E_b E_c = \delta_{bc} E_b$. A positive operator
valued measurement (POVM) is a set of positive semi-definite operators $E_b$ such
that $\sum_b E_b = I$, i.e., the orthogonality condition is dropped. It is well-known that
a general POVM can be represented as a projective measurement on a tensor
product space [?].

In fact, by noting that the map $\rho \mapsto \sum_b \sqrt{E_b}\, \rho \sqrt{E_b}$ is completely positive
and trace-preserving with Kraus operators $F_b = \sqrt{E_b}$, one can use the construction
above. Write $V = \sum_b \sqrt{E_b} \otimes |b\rangle$ where $|b\rangle$ is an orthonormal basis for $\mathbb{C}^M$
and $M$ is the number of measurements in the POVM, i.e., $b = 1 \ldots M$. Then
$V \rho V^\dagger = \sum_{b,c} \sqrt{E_b}\, \rho \sqrt{E_c} \otimes |b\rangle\langle c|$. Now, if $F_b = I \otimes |b\rangle\langle b|$, then $\{F_b\}$ is a projective
measurement on $\mathcal{H} \otimes \mathbb{C}^M$ and $\mathrm{Tr}\, F_b (V \rho V^\dagger) = \mathrm{Tr}\, E_b \rho$.



3.6 Adjoint maps 

It is sometimes useful to consider the adjoint, which we denote $\hat{\Phi}$, of a stochastic
map $\Phi$ with respect to the Hilbert-Schmidt inner product $\langle A, B \rangle = \mathrm{Tr}\, A^\dagger B$. When
$\Phi$ acts on $n \times n$ matrices, this adjoint (or dual) is fully defined by the requirement

$\mathrm{Tr}\, [\Phi(A)]^\dagger B = \mathrm{Tr}\, A^\dagger \hat{\Phi}(B)$    (27)

for all $n \times n$ matrices $A, B$. Indeed, when $\Phi(\rho) = \sum_k F_k \rho F_k^\dagger$, the adjoint is given
by $\hat{\Phi}(\rho) = \sum_k F_k^\dagger \rho F_k$. Moreover, $\Phi$ is trace-preserving if and only if $\hat{\Phi}$ is unital,
i.e., $\hat{\Phi}(I) = I$. When $\Phi$ is the partial trace, $T_2$, its adjoint takes $A \mapsto A \otimes I_2$.
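The defining relation (27) and the unital property can be verified for a map given in Kraus form. The sketch assumes NumPy and reuses the amplitude-damping Kraus operators as a made-up example; `phi` and `phi_hat` are hypothetical helper names.

```python
import numpy as np

# Example Kraus operators (assumption): amplitude damping with parameter g.
g = 0.3
F = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]]),
     np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]

def phi(x):
    """Stochastic map in Kraus form: sum_k F_k x F_k^dagger."""
    return sum(Fk @ x @ Fk.conj().T for Fk in F)

def phi_hat(x):
    """Adjoint map: sum_k F_k^dagger x F_k."""
    return sum(Fk.conj().T @ x @ Fk for Fk in F)

rng = np.random.default_rng(5)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

lhs = np.trace(phi(A).conj().T @ B)       # Tr [Phi(A)]^dagger B
rhs = np.trace(A.conj().T @ phi_hat(B))   # Tr A^dagger Phi_hat(B)
adjoint_ok = bool(np.isclose(lhs, rhs))

unital = np.allclose(phi_hat(np.eye(2)), np.eye(2))  # Phi_hat(I) = I
trace_preserving = bool(np.isclose(np.trace(phi(A)), np.trace(A)))
print(adjoint_ok, unital, trace_preserving)
```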



4 Subadditivity proofs 

To understand the proof of strong subadditivity, it is instructive to first
understand how Klein's inequality can be used to prove two weaker inequalities. First,
we consider the subadditivity inequality (1). Substituting $A = \rho_{12}$ and $B = \rho_1 \otimes \rho_2$
into Klein's inequality (16) yields

$-S(\rho_{12}) + S(\rho_1) + S(\rho_2) \ge \mathrm{Tr}(\rho_{12} - \rho_1 \otimes \rho_2) = 0,$    (28)

which is equivalent to subadditivity. Furthermore, the well-known conditions
for equality in subadditivity follow from the conditions for equality in Klein's
inequality, namely that equality holds if and only if $\rho_{12}$ is a tensor product, that
is, $\rho_{12} = \rho_1 \otimes \rho_2$.

A second, more powerful subadditivity inequality was obtained by Araki and
Lieb [?],

$S(\rho_{123}) \le S(\rho_{12}) + S(\rho_{23})$    (29)

under the constraint $\mathrm{Tr}\, \rho_{123} = 1$. To prove this, choose $A = \rho_{123}$ and $B =
e^{\log \rho_{12} + \log \rho_{23}}$ in Klein's inequality to obtain

$-S(\rho_{123}) + S(\rho_{12}) + S(\rho_{23}) \ge 1 - \mathrm{Tr}\, e^{\log \rho_{12} + \log \rho_{23}}.$    (30)

Applying Theorem 4 to the right-hand side gives

$-S(\rho_{123}) + S(\rho_{12}) + S(\rho_{23}) \ge 1 - \mathrm{Tr}_{123}\, \rho_{12} \rho_{23}$
$= 1 - \mathrm{Tr}_2 (\rho_2)^2$
$\ge 1 - \mathrm{Tr}_2\, \rho_2 = 0,$

where the last line follows from $(\rho_2)^2 \le \rho_2$ (which is the only place the
normalization condition $\mathrm{Tr}\, \rho_{123} = 1$ is needed). QED

The strategy for proving SSA is similar to that above, but with Theorem 4
replaced by Theorem 5. Let $A = \rho_{123}$ and choose $B$ so that $\log B = \log \rho_{12} -
\log \rho_2 + \log \rho_{23}$. Then Klein's inequality implies

$-S(\rho_{123}) + S(\rho_{12}) - S(\rho_2) + S(\rho_{23}) \ge \mathrm{Tr} \left( \rho_{123} - e^{\log \rho_{12} - \log \rho_2 + \log \rho_{23}} \right).$    (31)

Applying Lieb's result (17) to the right-hand side above, we obtain

$-S(\rho_{123}) + S(\rho_{12}) - S(\rho_2) + S(\rho_{23})$
$\ge \mathrm{Tr} \left( \rho_{123} - \int_0^\infty \rho_{12} \frac{1}{\rho_2 + uI} \rho_{23} \frac{1}{\rho_2 + uI}\, du \right)$
$= \mathrm{Tr}_{123}\, \rho_{123} - \mathrm{Tr}_2 \int_0^\infty \rho_2 \frac{1}{\rho_2 + uI} \rho_2 \frac{1}{\rho_2 + uI}\, du$
$= \mathrm{Tr}_{123}\, \rho_{123} - \mathrm{Tr}_2\, \rho_2 = 0.$

This proves SSA. Moreover, this approach allows us to easily determine the
conditions for equality, and thus complete the proof of Theorem 1. The first inequality
in the derivation above is satisfied with equality if and only if $A = B$ which is just
the condition (4). Although the conditions for equality in (17) are more difficult
to analyze, this is not necessary here. When $A = B$, it immediately follows that
$\mathrm{Tr}\, A = \mathrm{Tr}\, B$ so that the second inequality in the above derivation automatically
becomes an equality when (4) holds.



5 Inequalities for relative entropy 
5.1 Monotonicity under partial trace 

We now show how the same strategy can be applied to obtain a proof with equality
conditions for the monotonicity of relative entropy under partial trace.

Theorem 6 When $\rho_{12}, \gamma_{12} > 0$ and $\mathrm{Tr}\, \rho_{12} = \mathrm{Tr}\, \gamma_{12}$

$H(\rho_2, \gamma_2) \le H(\rho_{12}, \gamma_{12})$    (32)

with equality if and only if $\log \rho_{12} - \log \gamma_{12} = \log \rho_2 - \log \gamma_2$.

This condition should be interpreted as $\log \rho_{12} - \log \gamma_{12} = I_1 \otimes [\log \rho_2 - \log \gamma_2]$.
Since, as noted in Section 3.6, when $\Phi = T_1$, the action of $\hat{\Phi}$ is precisely $I_1 \otimes$, the
equality condition can be written as $\log \rho_{12} - \log \gamma_{12} = \hat{T}_1 [\log T_1(\rho_{12}) - \log T_1(\gamma_{12})]$
which is a special case of the more general form (40) developed later.

SSA can be regarded as a special case of this monotonicity result via the
correspondence $\rho_{12} \to \rho_{123}$, $\gamma_{12} \to \rho_{12}$, and Petz's form of the equality condition
becomes $\rho_2^{it} \gamma_2^{-it} = \rho_{12}^{it} \gamma_{12}^{-it}$. It is interesting to note that in [28], Lieb and Ruskai
actually obtained equation (32) from SSA using the convexity of the conditional
entropy $S(\rho_1) - S(\rho_{12})$ and the inequality (18).

Proof: Let $A = \rho_{12}$, $\log B = \log \gamma_{12} - \log \gamma_2 + \log \rho_2$. Then Klein's inequality
and (17) imply

$H(\rho_{12}, \gamma_{12}) - H(\rho_2, \gamma_2) \ge \mathrm{Tr}_{12} \left( \rho_{12} - e^{\log \gamma_{12} - \log \gamma_2 + \log \rho_2} \right)$
$\ge \mathrm{Tr}_{12} \left( \rho_{12} - \int_0^\infty \gamma_{12} \frac{1}{\gamma_2 + uI} \rho_2 \frac{1}{\gamma_2 + uI}\, du \right)$
$= \mathrm{Tr}_{12}\, \rho_{12} - \mathrm{Tr}_2 \int_0^\infty \gamma_2 \frac{1}{\gamma_2 + uI} \rho_2 \frac{1}{\gamma_2 + uI}\, du$
$= \mathrm{Tr}_{12}\, \rho_{12} - \mathrm{Tr}_2\, \rho_2 = 0.$

The equality condition is again precisely the condition $A = B$. QED
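Theorem 6 can be spot-checked numerically. The sketch below assumes NumPy, a pair of randomly generated strictly positive two-qubit states, and the partial trace $T_1$ over the first qubit; the helper names are hypothetical.

```python
import numpy as np

def random_density(dim, rng):
    """Random strictly positive density matrix (hypothetical test-state generator)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    rho = rho / np.trace(rho).real
    return 0.9 * rho + 0.1 * np.eye(dim) / dim  # keep eigenvalues away from zero

def logm_h(a):
    """Logarithm of a strictly positive Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def rel_entropy(rho, gamma):
    """H(rho, gamma) = Tr rho (log rho - log gamma), as in (5)."""
    return float(np.trace(rho @ (logm_h(rho) - logm_h(gamma))).real)

def trace_out_first(x, d1, d2):
    """Partial trace T_1 over the first factor of a (d1*d2)-dimensional system."""
    return np.trace(x.reshape(d1, d2, d1, d2), axis1=0, axis2=2)

rng = np.random.default_rng(6)
rho12 = random_density(4, rng)
gamma12 = random_density(4, rng)
rho2 = trace_out_first(rho12, 2, 2)
gamma2 = trace_out_first(gamma12, 2, 2)

H_full = rel_entropy(rho12, gamma12)
H_reduced = rel_entropy(rho2, gamma2)
print(H_reduced, H_full)  # expect H_reduced <= H_full, as in (32)
```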
5.2 Joint convexity of the relative entropy



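The joint convexity of relative entropy can be obtained directly from Theorem 6
by choosing $\rho_{12}$ (and similarly $\gamma_{12}$) to be a block diagonal matrix with blocks
$\lambda_k \rho^{(k)}$ (and $\lambda_k \gamma^{(k)}$). We can interpret the partial trace as a sum over blocks so
that $\rho = \rho_2 = \sum_k \lambda_k \rho^{(k)}$. However, it is worth giving a direct proof of the joint
convexity since it demonstrates the central role of Theorem 2.

Theorem 7 The relative entropy is jointly convex in its arguments, i.e., if $\rho =
\sum_k \lambda_k \rho^{(k)}$ and $\gamma = \sum_k \lambda_k \gamma^{(k)}$, then

$H(\rho, \gamma) \le \sum_k \lambda_k H(\rho^{(k)}, \gamma^{(k)})$    (33)

with equality if and only if $\log \rho - \log \gamma = \log \rho^{(k)} - \log \gamma^{(k)}$ for all $k$.

Proof: Let $A = \rho^{(k)}$ and $\log B = \log \rho - \log \gamma + \log \gamma^{(k)}$ with $\rho = \sum_k \lambda_k \rho^{(k)}$ and
$\gamma = \sum_k \lambda_k \gamma^{(k)}$. Then Klein's inequality implies

$H(\rho^{(k)}, \gamma^{(k)}) - \mathrm{Tr}\, \rho^{(k)} [\log \rho - \log \gamma] \ge \mathrm{Tr} \left( \rho^{(k)} - e^{\log \rho - \log \gamma + \log \gamma^{(k)}} \right).$    (34)

Multiplying this by $\lambda_k$ with $\lambda_k > 0$ and $\sum_k \lambda_k = 1$ yields, after summation,

$\sum_k \lambda_k H(\rho^{(k)}, \gamma^{(k)}) - H(\rho, \gamma) \ge \mathrm{Tr} \left( \rho - \sum_k \lambda_k e^{\log \rho - \log \gamma + \log \gamma^{(k)}} \right)$
$\ge \mathrm{Tr} \left( \rho - e^{\log \rho - \log \gamma + \log \sum_k \lambda_k \gamma^{(k)}} \right)$
$= \mathrm{Tr} \left( \rho - e^{\log \rho} \right) = 0$

where the second inequality is precisely the concavity of $C \mapsto F(C) = \mathrm{Tr}\, e^{K + \log C}$
with $K = \log \rho - \log \gamma$ and $C = \sum_k \lambda_k \gamma^{(k)}$.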
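The joint convexity inequality (33) can also be checked on random data. This is a sketch assuming NumPy, three randomly generated pairs of strictly positive $3 \times 3$ density matrices, and made-up weights $\lambda_k$; the helper names are hypothetical.

```python
import numpy as np

def random_density(dim, rng):
    """Random strictly positive density matrix (hypothetical test-state generator)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    rho = rho / np.trace(rho).real
    return 0.9 * rho + 0.1 * np.eye(dim) / dim  # keep eigenvalues away from zero

def logm_h(a):
    """Logarithm of a strictly positive Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def rel_entropy(rho, gamma):
    """H(rho, gamma) = Tr rho (log rho - log gamma), as in (5)."""
    return float(np.trace(rho @ (logm_h(rho) - logm_h(gamma))).real)

rng = np.random.default_rng(7)
lam = np.array([0.2, 0.5, 0.3])                     # illustrative weights
rhos = [random_density(3, rng) for _ in lam]
gammas = [random_density(3, rng) for _ in lam]

rho_avg = sum(l * r for l, r in zip(lam, rhos))
gamma_avg = sum(l * g for l, g in zip(lam, gammas))

lhs = rel_entropy(rho_avg, gamma_avg)               # H of the averages
rhs = float(sum(l * rel_entropy(r, g)               # average of the H values
                for l, r, g in zip(lam, rhos, gammas)))
print(lhs, rhs)  # expect lhs <= rhs, as in (33)
```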



5.3 Relationships among inequalities 

We make some additional remarks about connections between SSA and various
properties of relative entropy. To facilitate the discussion, we will use MONO to
denote the general monotonicity inequality (7), MPT to denote the special case
of monotonicity under partial traces, i.e., Theorem 6, and JC to denote the joint
convexity, Theorem 7. Using the restatement of SSA in the form (6), it is easy
to see that MONO $\Rightarrow$ MPT $\Rightarrow$ SSA. Before Theorem 7, we showed that MPT
$\Rightarrow$ JC. Similarly, by choosing $\rho_{123}$ to be block diagonal with blocks $\lambda_k \rho_{23}^{(k)}$ one can
show that SSA implies that the map $\rho_{12} \mapsto S(\rho_1) - S(\rho_{12})$ is convex. In [28]
it was observed that applying the convexity inequality (18) to this map (with
$A + xB = \rho_{12} + x\gamma_{12}$) yields (32). This shows that SSA $\Rightarrow$ MPT so that we have
the chain of implications

MONO $\Rightarrow$ MPT $\Leftrightarrow$ SSA $\Rightarrow$ JC.    (35)

One can show that JC $\Rightarrow$ MPT by using Uhlmann's observation [?] that the
partial trace can be written as a convex combination of unitary transformations.

One can also show directly that JC $\Rightarrow$ SSA by using the purification process
described in Section 3.3 to show that SSA is equivalent to

$S(\rho_4) + S(\rho_2) \le S(\rho_{12}) + S(\rho_{14}).$    (36)

Moreover, if $\rho_{124}$ is pure, then $S(\rho_4) = S(\rho_{12})$ and $S(\rho_2) = S(\rho_{14})$ so that equality holds in (36).
Since the extreme points of the convex set of density matrices are pure states, the
inequality (36) then follows from the joint convexity, Theorem 7. Thus we have

MONO $\Rightarrow$ MPT $\Leftrightarrow$ SSA $\Leftrightarrow$ JC.    (37)

Lindblad [?] completed this circuit by showing that MPT $\Rightarrow$ MONO.
Using the representation described in Section 3.4, with $V$ the partial isometry
from $\mathcal{H}$ to $\mathcal{H} \otimes \mathbb{C}^m$ as in (26), one finds

$H[\Phi(\rho), \Phi(\gamma)] = H[\mathrm{Tr}_B (V \rho V^\dagger), \mathrm{Tr}_B (V \gamma V^\dagger)]$
$\le H[V \rho V^\dagger, V \gamma V^\dagger]$    (38)
$= H(\rho, \gamma)$    (39)

since $\mathrm{Tr}\, V \rho V^\dagger \log(V \gamma V^\dagger) = \mathrm{Tr}\, \rho \log \gamma$ for a partial isometry $V$.

6 Equality in monotonicity under stochastic maps 

Conditions for equality in the general monotonicity inequality (7) may be more
subtle since it is not always possible to achieve equality. Indeed, it was noted in
[?] that $\sup_{\rho \neq \gamma} \frac{H[\Phi(\rho), \Phi(\gamma)]}{H(\rho, \gamma)}$ can be strictly less than 1. Using the reformulation
(38) above, we prove the following result.

Theorem 8 Equality holds in (7), $H[\Phi(\rho), \Phi(\gamma)] \le H(\rho, \gamma)$, if and only if

$\log \rho - \log \gamma = \hat{\Phi} [\log \Phi(\rho) - \log \Phi(\gamma)]$    (40)

where $\hat{\Phi}$ denotes the adjoint of $\Phi$ with respect to the Hilbert-Schmidt inner product
as defined in Section 3.6.

To verify sufficiency, multiply (40) by $\rho$ and take the trace to obtain

$H[\rho, \gamma] = \mathrm{Tr}\, \rho\, \hat{\Phi} [\log \Phi(\rho) - \log \Phi(\gamma)]$
$= \mathrm{Tr}\, \Phi(\rho) [\log \Phi(\rho) - \log \Phi(\gamma)]$
$= H[\Phi(\rho), \Phi(\gamma)].$

It is tempting to follow our previous strategy and choose $A = \rho$, $\log B =
\log \gamma + \hat{\Phi} [\log \Phi(\rho) - \log \Phi(\gamma)]$. However, we have been unable to verify that
$\mathrm{Tr}\, e^{\log \gamma + \hat{\Phi} [\log \Phi(\rho) - \log \Phi(\gamma)]} \le 1$ as required by this approach.

Instead, we use the representation (24) or (26). Rather than applying the
equality conditions in Theorem 6 directly to (38), it is useful to repeat the
argument for an appropriate choice of $A$ and $B$.

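Proof: Choose $A = V \rho V^\dagger$, $\log B = \log(V \gamma V^\dagger) + \log \mathrm{Tr}_2 (V \rho V^\dagger) - \log \mathrm{Tr}_2 (V \gamma V^\dagger)$
where $V$ is again the partial isometry as in (26) of Section 3.4. $B$ is defined
so that the last two terms in $\log B$ are extended from $\mathcal{H}$ to $\mathcal{H} \otimes \mathbb{C}^m$ and so that
$\ker(B) \subset \ker(A)$. The condition for equality in (38) is then

$\log(V \rho V^\dagger) - \log(V \gamma V^\dagger) = \log \mathrm{Tr}_2 (V \rho V^\dagger) - \log \mathrm{Tr}_2 (V \gamma V^\dagger)$    (41)
$= \log \Phi(\rho) - \log \Phi(\gamma).$

We can put this into a more useful form by noting that for a partial isometry $V$

$\log(V \rho V^\dagger) - \log(V \gamma V^\dagger) = V [\log \rho - \log \gamma] V^\dagger$    (42)

from which it follows that (41) is equivalent to

$V [\log \rho - \log \gamma] V^\dagger = \log \Phi(\rho) - \log \Phi(\gamma).$    (43)

Multiplying by $V^\dagger$ on the left and $V$ on the right and using that $V^\dagger V = I$, one
sees that (43) implies

$\log \rho - \log \gamma = V^\dagger [\log \Phi(\rho) - \log \Phi(\gamma)] V.$    (44)

Taking the partial trace $\mathrm{Tr}_2$ over the auxiliary space in (44) yields (40) since
$\hat{\Phi}(P) = \sum_k F_k^\dagger P F_k = V^\dagger P V$ for all operators $P$ on $\mathcal{H}$. QED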

Another useful necessary condition for equality in (7) can be obtained by
multiplying both sides of (43) by the projection $V V^\dagger$. Since $V^\dagger V = I$, one finds

$V V^\dagger [\log \Phi(\rho) - \log \Phi(\gamma)] = V [\log \rho - \log \gamma] V^\dagger = [\log \Phi(\rho) - \log \Phi(\gamma)] V V^\dagger$    (45)

i.e., the projection $V V^\dagger$ commutes with $[\log \Phi(\rho) - \log \Phi(\gamma)]$. Taking the partial
trace and noting that $\Phi(I) = \mathrm{Tr}_2 V V^\dagger$ we can summarize this discussion in the
following

Corollary 9 If equality holds in (7), then

$\Phi(\log \rho - \log \gamma) = \Phi(I) [\log \Phi(\rho) - \log \Phi(\gamma)] = [\log \Phi(\rho) - \log \Phi(\gamma)] \Phi(I).$    (46)

Moreover, $\log \Phi(\rho) - \log \Phi(\gamma)$ commutes with the projection $V V^\dagger = \sum_{k\ell} |k\rangle\langle\ell|\, F_k F_\ell^\dagger$
where $\{F_k\}$ is a set of Kraus operators for $\Phi$, i.e., $\Phi(\rho) = \sum_k F_k \rho F_k^\dagger$ and $|k\rangle$ is
an orthonormal basis for the auxiliary space $\mathcal{H}_2$.

The results of this section also hold in the more general situation when $\Phi :
\mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_{A'})$ maps operators on one Hilbert space to those on another, in
which case $F_k : \mathcal{H}_A \to \mathcal{H}_{A'}$.



7 The Holevo bound 
7.1 B ackground 

One reason for studying conditions for equality is that other results, such as 



Holevo's celebrated bound [jT4| on the accessible information, can be obtained 
rather easily from SSA or some form of the monotonicity of relative entropy. 
However, obtaining the corresponding conditions for equality is not as straight- 
forward as one might hope because of the need to introduce an auxiliary system. 
Although Holevo's bound is quite general, it is often applied in situations where 
Pj = ^{pj) is the output of a noisy quantum channel $ with input pj. We use the 
tilde ~ as a reminder of this, as well as to ensure a distinction from other density 
matrices which arise. 

For any fixed POVM and density matrix $\gamma$, $p(b) = \mathrm{Tr}(\gamma E_b)$ defines a classical
probability distribution whose entropy we denote $S[\mathrm{Tr}\,\gamma E_b]$. The Holevo bound
states that for any ensemble of density matrices $\mathcal{E} = \{\pi_j \tilde\rho_j\}$ with average density
matrix $\tilde\rho = \sum_j \pi_j \tilde\rho_j$ the accessible information in the ensemble satisfies

$$
\begin{aligned}
I(\mathcal{E},\mathcal{M}) &= S[\mathrm{Tr}\,\tilde\rho E_b] - \sum_j \pi_j\, S[\mathrm{Tr}\,\tilde\rho_j E_b] &\qquad (47) \\
&\leq S(\tilde\rho) - \sum_j \pi_j\, S(\tilde\rho_j) &\qquad (48)
\end{aligned}
$$

for any POVM $\mathcal{M} = \{E_b\}$. If all of the $\tilde\rho_j$ commute, then it is easy to see that
equality can be achieved by choosing the $E_b$ to be the spectral projections which
simultaneously diagonalize the density matrices $\tilde\rho_j$. We wish to show that this
condition is also necessary, i.e., equality can only be achieved in (48) if all the $\tilde\rho_j$
commute.
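The two sides of (47) and (48) can be compared numerically in a small case. The following is a minimal sketch, not part of the original argument: it assumes qubit states given as real symmetric 2x2 matrices, a fixed projective measurement $E_b = |b\rangle\langle b|$ in the computational basis, and natural logarithms. The ensembles `commuting` and `noncommuting` and all function names are illustrative choices.

```python
import math

def shannon(probs):
    """Shannon entropy (natural log), with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def eigenvalues(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    mean, radius = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    return [mean + radius, mean - radius]

def entropy(rho):
    """Von Neumann entropy S(rho) = -Tr(rho log rho) of a qubit state."""
    return shannon(eigenvalues(rho))

def average(weights, states):
    """Average density matrix sum_j pi_j rho_j."""
    return [[sum(w * s[r][c] for w, s in zip(weights, states)) for c in range(2)]
            for r in range(2)]

def info_and_chi(weights, states):
    """Return (I, chi): the right sides of (47) and (48) for E_b = |b><b|."""
    avg = average(weights, states)
    diag = lambda m: [m[0][0], m[1][1]]  # outcome probabilities Tr(rho E_b)
    info = shannon(diag(avg)) - sum(w * shannon(diag(s)) for w, s in zip(weights, states))
    chi = entropy(avg) - sum(w * entropy(s) for w, s in zip(weights, states))
    return info, chi

weights = [0.5, 0.5]
# Both states diagonal in the measurement basis, so they commute.
commuting = [[[0.9, 0.0], [0.0, 0.1]], [[0.2, 0.0], [0.0, 0.8]]]
# Second state has off-diagonal terms and does not commute with the first.
noncommuting = [[[0.9, 0.0], [0.0, 0.1]], [[0.5, 0.4], [0.4, 0.5]]]

info_c, chi_c = info_and_chi(weights, commuting)
info_n, chi_n = info_and_chi(weights, noncommuting)
print("commuting:    I =", info_c, " chi =", chi_c)
print("noncommuting: I =", info_n, " chi =", chi_n)
```

In the commuting case the two numbers agree up to rounding, while in the non-commuting case this particular measurement falls strictly below the bound. The sketch tests only one POVM, so it illustrates the inequality rather than the necessity claim.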
It is known [|T^, [5^] that (48) can be obtained from (|^. First, observe that

$$S(\tilde\rho) - \sum_j \pi_j S(\tilde\rho_j) = \sum_j \pi_j H(\tilde\rho_j, \tilde\rho). \tag{49}$$

Now let $\Omega_{\mathcal M}$ be the map $\Omega_{\mathcal M}(A) = \sum_b |b\rangle\langle b|\, \mathrm{Tr}(A E_b)$ where $\mathcal{M} = \{E_b\}$. Then $\Omega_{\mathcal M}$
is a stochastic map of the special type known as a Q-C channel and the Holevo
bound (48) follows immediately from (49) and

$$H[\Omega_{\mathcal M}(\tilde\rho_j), \Omega_{\mathcal M}(\tilde\rho)] \leq H(\tilde\rho_j, \tilde\rho). \tag{50}$$
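As a numerical illustration of (49) and (50), the sketch below evaluates both for a two-state qubit ensemble. It is only a check under stated assumptions: the states are full-rank real symmetric 2x2 matrices, the POVM is the projective measurement $E_b = |b\rangle\langle b|$ (so the Q-C map simply keeps the diagonal), and the helper names `sym_fun`, `rel_entropy` and `dephase` are hypothetical.

```python
import math

def sym_fun(m, f):
    """Apply a scalar function f to a real symmetric 2x2 matrix via its spectrum."""
    (a, b), (_, d) = m
    mean, radius = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    if radius == 0.0:                      # multiple of the identity
        return [[f(mean), 0.0], [0.0, f(mean)]]
    hi, lo = f(mean + radius), f(mean - radius)
    s, t = 0.5 * (hi + lo), 0.5 * (hi - lo) / radius
    return [[s + t * (a - mean), t * b], [t * b, s + t * (d - mean)]]

def rel_entropy(rho, gamma):
    """H(rho, gamma) = Tr rho (log rho - log gamma) for full-rank states."""
    lr, lg = sym_fun(rho, math.log), sym_fun(gamma, math.log)
    return sum(rho[r][c] * (lr[c][r] - lg[c][r]) for r in range(2) for c in range(2))

def entropy(rho):
    """S(rho) = -Tr rho log rho."""
    lr = sym_fun(rho, math.log)
    return -sum(rho[r][c] * lr[c][r] for r in range(2) for c in range(2))

def dephase(rho):
    """Q-C map for E_b = |b><b|: sum_b |b><b| Tr(rho E_b)."""
    return [[rho[0][0], 0.0], [0.0, rho[1][1]]]

weights = [0.5, 0.5]
states = [[[0.9, 0.0], [0.0, 0.1]], [[0.5, 0.4], [0.4, 0.5]]]
avg = [[sum(w * s[r][c] for w, s in zip(weights, states)) for c in range(2)]
       for r in range(2)]

# Left and right sides of (49).
chi = entropy(avg) - sum(w * entropy(s) for w, s in zip(weights, states))
chi_via_h = sum(w * rel_entropy(s, avg) for w, s in zip(weights, states))

# H(rho_j, rho) - H[Omega(rho_j), Omega(rho)] for each j; (50) says these are >= 0.
gaps = [rel_entropy(s, avg) - rel_entropy(dephase(s), dephase(avg)) for s in states]
print(chi, chi_via_h, gaps)
```

Averaging the entries of `gaps` with the weights reproduces the difference between (48) and (47) for this measurement.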
7.2 Equality conditions 

We will henceforth assume that $\{\pi_j, \tilde\rho_j\}$ is a fixed ensemble and seek conditions
under which we can find a POVM satisfying the equality requirements. Since
the adjoint map is $\widehat{\Omega}_{\mathcal M}(D) = \sum_b E_b \langle b, D\, b\rangle$, applying Theorem || yields conditions for equality in
(50). For equality in (48) these conditions must hold for every $j$ and reduce to

$$\log\tilde\rho_j - \log\tilde\rho = \sum_b E_b \log\frac{\mathrm{Tr}\,E_b\tilde\rho_j}{\mathrm{Tr}\,E_b\tilde\rho} \qquad \forall\, j \tag{51}$$

where this should be interpreted as a condition on $\ker(\tilde\rho_j)^\perp$ in which case all
terms are well-defined. (Indeed, since the condition arises from the use of Klein's
inequality and the requirement $A = B$, the operators in $B$ must be defined to
be zero on $\ker(A)$, which reduces to $\ker(\tilde\rho_j)$ in the situation considered here.)
If the POVM $\{E_b\}$ consists of a set of mutually orthogonal projections, then it
is immediate that the operators $Z_j = \log\tilde\rho_j - \log\tilde\rho$ commute, since (51) can be
regarded as the spectral decomposition of $Z_j$. To show that the $\tilde\rho_j$ themselves
commute, observe that

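$$
\begin{aligned}
1 = \mathrm{Tr}\,\tilde\rho_j &= \mathrm{Tr}\, e^{\log\tilde\rho + (\log\tilde\rho_j - \log\tilde\rho)} \\
&\leq \mathrm{Tr}\,\tilde\rho\, e^{\log\tilde\rho_j - \log\tilde\rho} \\
&= \mathrm{Tr}\,\tilde\rho\, \exp\Big(\sum_b E_b \log\frac{\mathrm{Tr}\,E_b\tilde\rho_j}{\mathrm{Tr}\,E_b\tilde\rho}\Big) \\
&= \mathrm{Tr}\,\tilde\rho \sum_b E_b\, \frac{\mathrm{Tr}\,E_b\tilde\rho_j}{\mathrm{Tr}\,E_b\tilde\rho} \\
&= \sum_b \mathrm{Tr}\,E_b\tilde\rho_j = 1
\end{aligned}
$$

where we have used Theorem ^ with $A = \log\tilde\rho$, $B = \log\tilde\rho_j - \log\tilde\rho$, and the fact
that for orthogonal projections $e^{\sum_b a_b E_b} = \sum_b e^{a_b} E_b$. Since both ends of this chain
equal 1, the inequality must in fact be an equality. The conditions for equality
in Theorem ^ then imply that $\log\tilde\rho_j$ and $\log\tilde\rho$ commute for all $j$. Hence $\tilde\rho_j$ and
$\tilde\rho_k$ also commute for all $j, k$ when the POVM consists of mutually orthogonal
projections.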

Using King's observation in the next section, one can reduce the general case
to that of projective measurements. However, we prefer to use the equality
conditions to show directly that the elements of the POVM must be orthogonal.
Moreover, the commutativity condition involving $VV^\dagger$ is reminiscent of the more
sophisticated Connes cocycle approach used by Petz, and thus of some interest.

Since the Kraus operators for the Q-C map $\Omega_{\mathcal M}$ can be chosen as $F_{kb} =
|b\rangle\langle k|\sqrt{E_b}$ where $|b\rangle$ and $|k\rangle$ are orthonormal bases, one finds

= EE {k./E'k^J,) = \b){c\ {(t>4Eb4Ec<P). (52) 

b,c k,t b,c 

By (45), this must commute for all $j$ with $\log\Omega_{\mathcal M}(\tilde\rho_j) - \log\Omega_{\mathcal M}(\tilde\rho)$, which can be
written in the form $\sum_b z_{bj}\,|b\rangle\langle b|$ with $z_{bj} = \log\frac{\mathrm{Tr}\,E_b\tilde\rho_j}{\mathrm{Tr}\,E_b\tilde\rho}$. A diagonal operator of the
form $\sum_b z_b\,|b\rangle\langle b|$ with all $z_b \neq 0$ will commute with the projection in (52) if and
only if all off-diagonal terms are zero. This will hold if the POVM is a projective
measurement, since then $\sqrt{E_b}\sqrt{E_c} = E_b E_c = E_b\,\delta_{bc}$. To see that this is necessary,
note that the possibility that the vector $\phi$ is orthogonal to all $E_b$ is precluded
by the condition that $\sum_b E_b = I$. Moreover, since the orthonormal basis $|k\rangle$ is
arbitrary, $\phi$ can be chosen to be arbitrary. The restriction that (51) hold only on
$\ker(\tilde\rho_j)^\perp$ may permit some $z_{bj} = 0$; however, for each $b$ there will always be at
least one $j$ for which $z_{bj} \neq 0$, and this suffices. QED
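The choice of Kraus operators $F_{kb} = |b\rangle\langle k|\sqrt{E_b}$ used above can be checked numerically. The sketch below is illustrative only: it assumes a two-outcome POVM on a qubit with real symmetric elements (so the adjoint is the transpose), and the particular `povm` and `rho` are arbitrary test data. It confirms that $\sum_{k,b} F_{kb}\,\rho\,F_{kb}^\dagger = \sum_b |b\rangle\langle b|\,\mathrm{Tr}(\rho E_b)$ and that $\sum_{k,b} F_{kb}^\dagger F_{kb} = I$.

```python
import math

def sym_sqrt(m):
    """Square root of a real symmetric positive semi-definite 2x2 matrix."""
    (a, b), (_, d) = m
    mean, radius = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    hi, lo = math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))
    if radius == 0.0:
        return [[hi, 0.0], [0.0, hi]]
    s, t = 0.5 * (hi + lo), 0.5 * (hi - lo) / radius
    return [[s + t * (a - mean), t * b], [t * b, s + t * (d - mean)]]

def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def transpose(x):
    return [[x[c][r] for c in range(2)] for r in range(2)]

def kraus(povm):
    """F_{kb} = |b><k| sqrt(E_b): row b of F_{kb} is row k of sqrt(E_b)."""
    ops = []
    for b, e in enumerate(povm):
        root = sym_sqrt(e)
        for k in range(2):
            f = [[0.0, 0.0], [0.0, 0.0]]
            f[b] = list(root[k])
            ops.append(f)
    return ops

# A POVM that is not projective: E_0 + E_1 = I, both positive definite.
povm = [[[0.8, 0.1], [0.1, 0.3]], [[0.2, -0.1], [-0.1, 0.7]]]
rho = [[0.6, 0.2], [0.2, 0.4]]

out = [[0.0, 0.0], [0.0, 0.0]]            # accumulates sum F rho F^T
completeness = [[0.0, 0.0], [0.0, 0.0]]   # accumulates sum F^T F
for f in kraus(povm):
    term = matmul(matmul(f, rho), transpose(f))
    gram = matmul(transpose(f), f)
    for r in range(2):
        for c in range(2):
            out[r][c] += term[r][c]
            completeness[r][c] += gram[r][c]

# Expected diagonal entries Tr(E_b rho).
expected = [sum(e[r][c] * rho[c][r] for r in range(2) for c in range(2)) for e in povm]
print(out, expected, completeness)
```

The output is diagonal even though the POVM elements are not orthogonal; the non-orthogonality shows up only in the projection $VV^\dagger$, which is why the commutation condition is needed.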

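One can obtain an alternate form of the equality conditions from Corollary 9.
Since $\Phi(I) = \sum_b |b\rangle\langle b|\,\mathrm{Tr}\,E_b$ when $\Phi = \Omega_{\mathcal M}$, another necessary condition for equality in (48) is

$$\mathrm{Tr}\,E_b\,[\log\tilde\rho_j - \log\tilde\rho] = (\mathrm{Tr}\,E_b)\,(\log\mathrm{Tr}\,E_b\tilde\rho_j - \log\mathrm{Tr}\,E_b\tilde\rho) \qquad \forall\, j, b. \tag{53}$$

Inserting this in (51) yields the requirement

$$\log\tilde\rho_j - \log\tilde\rho = \sum_b \frac{1}{\mathrm{Tr}\,E_b}\, E_b\, \mathrm{Tr}\,E_b\,[\log\tilde\rho_j - \log\tilde\rho] \tag{54}$$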

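which can be rewritten as

$$|Z_j\rangle = \sum_b \frac{1}{\mathrm{Tr}\,E_b}\, |E_b\rangle\langle E_b | Z_j\rangle \tag{55}$$

where $Z_j = \log\tilde\rho_j - \log\tilde\rho$ and the bra-ket now refer to the Hilbert-Schmidt inner
product. This implies that $\sum_b \frac{1}{\mathrm{Tr}\,E_b}|E_b\rangle\langle E_b|$ projects onto $\mathrm{span}(\{Z_j\})$. However,
this alone is not sufficient to imply that the $E_b$ form a projective measurement.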
7.3 Other approaches



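Chris King has observed that when the POVM is a projective measurement of
the form $E_b = |b\rangle\langle b|$, one can obtain the Holevo bound from the joint convexity
of relative entropy. Let $\beta(\rho) = \sum_b |b\rangle\langle b|\,\mathrm{Tr}\,E_b\rho$. Then applying Theorem ^ to
$H[\tilde\rho, \beta(\tilde\rho)]$ yields

$$-S(\tilde\rho) + S(\mathrm{Tr}\,E_b\tilde\rho) \leq \sum_j \pi_j\,[\,-S(\tilde\rho_j) + S(\mathrm{Tr}\,E_b\tilde\rho_j)\,] \tag{56}$$

or

$$S(\mathrm{Tr}\,E_b\tilde\rho) - \sum_j \pi_j S(\mathrm{Tr}\,E_b\tilde\rho_j) \leq S(\tilde\rho) - \sum_j \pi_j S(\tilde\rho_j)$$

with equality if and only if

$$\log\tilde\rho - \sum_b |b\rangle\langle b| \log\mathrm{Tr}\,E_b\tilde\rho = \log\tilde\rho_j - \sum_b |b\rangle\langle b| \log\mathrm{Tr}\,E_b\tilde\rho_j \qquad \forall\, j. \tag{57}$$

This is equivalent to (51) when $E_b = |b\rangle\langle b|$, and the argument can be extended
to more general projective measurements.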

King also pointed out that if {Eb} is an arbitrary POVM, the construction 
in Section ^3] can be used to show that (^) and (^1]) are equivalent to the 
equalities obtained when $\tilde\rho_j$ is replaced by $V\tilde\rho_j V^\dagger$ and $E_b$ by $F_b$. Since the $\{F_b\}$
form a projective measurement, we can conclude from the argument above that
equality implies that all $V\tilde\rho_j V^\dagger$ commute, which implies that all $\tilde\rho_j$ also commute
since $V^\dagger V = I$.

It should be noted that Petz was able to use his equality conditions to find the
conditions for equality in the Holevo bound and this is sketched in ||3^. Indeed,
Petz's analogue of (57) is $\tilde\rho^{it} D^{-it} = \tilde\rho_j^{it} D_j^{-it}$ $\forall\, j$ where $D$, $D_j$ denote the diagonal
parts of $\tilde\rho$, $\tilde\rho_j$ respectively. Then

$$\tilde\rho_j^{it} = \tilde\rho^{it} D^{-it} D_j^{it}. \tag{58}$$

Since (58) holds for all real $t$, as well as all $j$, it also implies $\tilde\rho_j^{-it} = \tilde\rho^{-it} D^{it} D_j^{-it}$.
However, taking the adjoint of (58) yields $\tilde\rho_j^{-it} = D_j^{-it} D^{it} \tilde\rho^{-it}$. Therefore, $\tilde\rho^{-it}$
commutes with the diagonal matrix $D^{it} D_j^{-it} = D_j^{-it} D^{it}$ and must also be
diagonal. This gives a simultaneous diagonalization of all $\tilde\rho_j^{it}$ which means that all $\tilde\rho_j$
commute.

Holevo's original longer derivation of the bound (48) also concluded that
commutativity was necessary and sufficient for equality. Some simplifications of
this argument were given by Fuchs [0 in his thesis.
7.4 Another bound on accessible information

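When $\rho$ is a density matrix, the mapping $A \mapsto \rho^{1/2} A \rho^{1/2}$ and its inverse give
a duality between ensembles and POVM's. Hall observed that this duality
can be used to give another upper bound on the accessible information (47) in
terms of the POVM and average density $\tilde\rho$, i.e.,

$$
\begin{aligned}
I(\mathcal{E},\mathcal{M}) &\leq S(\tilde\rho) - \sum_b \tau_b\, S\Big(\tfrac{1}{\tau_b}\sqrt{\tilde\rho}\,E_b\sqrt{\tilde\rho}\Big) &\qquad (59) \\
&= \sum_b \tau_b\, H\Big(\tfrac{1}{\tau_b}\sqrt{\tilde\rho}\,E_b\sqrt{\tilde\rho},\ \tilde\rho\Big) &\qquad (60)
\end{aligned}
$$

where $\tau_b = \mathrm{Tr}\,E_b\tilde\rho$. This inequality can be obtained from the monotonicity of
relative entropy under the Q-C map $\Omega_{\mathcal E}(A) = \sum_j |j\rangle\langle j|\, \mathrm{Tr}\big(A\,\pi_j \tilde\rho^{-1/2}\tilde\rho_j\tilde\rho^{-1/2}\big)$ applied
to $H\big(\tfrac{1}{\tau_b}\sqrt{\tilde\rho}\,E_b\sqrt{\tilde\rho},\ \tilde\rho\big)$ as in (50); or as in [19] where an equivalent bound was
given. The argument in Section 7.2 can then be used to show that equality can
be achieved in (59) if and only if all $\sqrt{\tilde\rho}\,E_b\sqrt{\tilde\rho}$ commute. Hall [|1^ also found this
condition and noted that it implies that $\tilde\rho$ commutes with every $E_b$ in the POVM.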
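The bound (59) can be compared with the Holevo bound (48) on a small example. The sketch below is an illustration under stated assumptions rather than a general implementation: qubit states and POVM elements are real symmetric 2x2 matrices, the average state is invertible, and the specific `states`, `weights` and `povm` are arbitrary test data.

```python
import math

def shannon(probs):
    """Shannon entropy (natural log), with 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def eigenvalues(m):
    """Eigenvalues of a real symmetric 2x2 matrix."""
    (a, b), (_, d) = m
    mean, radius = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    return [mean + radius, mean - radius]

def sym_sqrt(m):
    """Square root of a real symmetric positive semi-definite 2x2 matrix."""
    (a, b), (_, d) = m
    mean, radius = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    hi, lo = math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))
    if radius == 0.0:
        return [[hi, 0.0], [0.0, hi]]
    s, t = 0.5 * (hi + lo), 0.5 * (hi - lo) / radius
    return [[s + t * (a - mean), t * b], [t * b, s + t * (d - mean)]]

def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def trace_prod(x, y):
    """Tr(x y) for 2x2 matrices."""
    return sum(x[r][c] * y[c][r] for r in range(2) for c in range(2))

weights = [0.5, 0.5]
states = [[[0.9, 0.0], [0.0, 0.1]], [[0.5, 0.4], [0.4, 0.5]]]
povm = [[[0.8, 0.1], [0.1, 0.3]], [[0.2, -0.1], [-0.1, 0.7]]]
avg = [[sum(w * s[r][c] for w, s in zip(weights, states)) for c in range(2)]
       for r in range(2)]
root = sym_sqrt(avg)

tau = [trace_prod(e, avg) for e in povm]                          # tau_b = Tr(E_b rho)
info = shannon(tau) - sum(w * shannon([trace_prod(e, s) for e in povm])
                          for w, s in zip(weights, states))       # eq. (47)
dual = [matmul(matmul(root, e), root) for e in povm]              # sqrt(rho) E_b sqrt(rho)
hall = shannon(eigenvalues(avg)) - sum(
    t * shannon([lam / t for lam in eigenvalues(d)]) for t, d in zip(tau, dual))  # eq. (59)
chi = shannon(eigenvalues(avg)) - sum(
    w * shannon(eigenvalues(s)) for w, s in zip(weights, states))  # eq. (48)
print(info, hall, chi)
```

Both `hall` and `chi` bound `info` from above; neither bound dominates the other in general, since one depends on the POVM and the other on the ensemble.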

One is often interested in (48) and (60) when one wants to optimize the
accessible information after using a noisy quantum channel, $\Phi$. It was observed
[19] that, since $\mathrm{Tr}\,\Phi(\rho_j)E_b = \mathrm{Tr}\,\rho_j\widehat{\Phi}(E_b)$, one can regard the noise as either
acting to transform pure inputs $\rho_j$ to mixed state outputs $\Phi(\rho_j)$ or as acting
through the adjoint $\widehat{\Phi}$ on the POVM with uncorrupted outputs. In the first
case, one can bound the right side of (59) by choosing the $E_b$ to be the spectral
projections of the average output state $\Phi(\rho)$ to yield $I[\Phi(\mathcal{E}), \mathcal{M}] \leq S[\Phi(\rho)]$ which
is weaker than the corresponding Holevo bound. Moreover, since the optimal
choice for $\Phi(\rho_j)$ need not be in the image of $\Phi$, it is not necessarily achievable even
though the commutativity condition holds. Hall [|T^ discussed other situations in
which the bound can not be achieved despite the fact that all $\sqrt{\tilde\rho}\,E_b\sqrt{\tilde\rho}$ commute.
Viewing the noise as acting on the POVM, King and Ruskai [|l^ defined

$$U_{EP}(\Phi) = \sup_{\rho,\mathcal{M}} \Big[ S(\rho) - \sum_b \tau_b\, S\Big(\tfrac{1}{\tau_b}\sqrt{\rho}\,\widehat{\Phi}(E_b)\sqrt{\rho}\Big) \Big] \tag{61}$$

with $\tau_b = \mathrm{Tr}\,\rho\,\widehat{\Phi}(E_b) = \mathrm{Tr}\,\Phi(\rho)E_b$. If the supremum in (61) is achieved with an
average density and POVM for which $\sqrt{\rho}\,\widehat{\Phi}(E_b)\sqrt{\rho}$ do not commute, then $U_{EP}(\Phi)$
is strictly greater than the accessible information. The questions of whether or
not (61) can actually exceed the optimal accessible information, and how it might
then be interpreted are under investigation.
8 Concluding remarks



The proof presented here for each inequality, SSA, Theorem ^, Theorem ^ and
the general monotonicity (|^ , is quite short: only half a page using results from
Section ^ which require less than one additional page and Theorem . However,
as shown in the Appendix, even this result does not require a long argument if 
one is permitted to use some powerful tools of complex analysis. 

It is certainly not unusual to find that complex analysis can be extremely
useful, even when the functions of interest are real-valued. Indeed, Lieb's origi- 
nal proof of the concavity of WYD entropy used a complex interpolation argu- 
ment. In his influential book [I^ on Trace Ideals, Simon (extracting ideas from 
Uhlmann [Q]) gave a longer "elementary" proof using the Schwarz inequality, per- 
haps inadvertently reinforcing the notion that any complete proof of SSA is long 
and forbidding. Similar ideas are implicit in Ando who restates the result in 
terms of tensor product spaces and block matrices. Uhlmann [Q] again demon- 
strated the power of complex interpolation by using it to prove the monotonicity 
of relative entropy under completely positive trace-preserving maps. SSA then 
follows immediately as a special case. However, Uhlmann's approach, which has 
been extended by Petz [|35| , was developed within the framework of the relative 
modular operator formalism developed by Araki |^, Q for much more general 
situations. Recently, Lesniewski and Ruskai [2£] observed that within this relative 
modular operator framework, monotonicity can be established directly using an 
argument based on the Schwarz inequality. 

The approach of this review is similar to that of Wehrl in that we view
Theorem as the "essential ingredient". Indeed, Uhlmann [43, 47], using a
completely different approach, had independently recognized that Theorem ^ would
imply SSA. However, Wehrl's otherwise excellent review stated (at the end of
section III.B) that "Unfortunately, the proof of [this] is not easy at all." Later (in
section III.C) Wehrl again states that "... the proof is surprisingly complicated. I
want to indicate only that the concavity of $\mathrm{Tr}\, e^{K + \log A}$ can be obtained from Lieb's
theorem [on concavity of the WYD entropy] through a sequence of lemmas."
Although aware that Epstein's approach [11], which was developed shortly after
Lieb announced his results, permitted a "direct" proof of Theorem ^ Wehrl does 
not seem to have fully appreciated it. The utility of Epstein's technique may have 
been underestimated, in part, because he presented his results in a form which 
applied to the full collection of convex trace functions studied in [^. Checking 
Epstein's hypotheses for the WYD function requires some non-trivial mapping 
theorems. This may have obscured the elegance of the argument in Appendix A. 

It is worth noting that if the concavity of WYD entropy is regarded as the 
key result, it is not necessary to use the long sequence of lemmas Wehrl refers 
to in order to prove SSA. Lindblad ||3^ gave a direct proof of the joint convexity,
Theorem , by differentiating the WYD function. Once this is done, SSA
follows via the purification argument sketched after equation (36) or, alternatively,
the variant of Uhlmann's argument described in Combining this with 



Lieb's original complex interpolation proof of the concavity of the WYD function, 
yields another "short" proof of SSA, albeit one which does not appear to be well- 
suited to establishing conditions for equality. 

Finally, we mention that Carlen and Lieb ||^ obtained another proof of SSA 
by using Epstein's technique to prove some Minkowski type inequalities for Lp 
trace norms. Using a different approach, King [20, 21] recently proved several
additivity results for the minimal entropy and Holevo capacity of a noisy channel 
by using Lp inequalities in which Epstein's technique provided a critical estimate. 
This suggests that connections with Lp inequalities, as advocated by Amosov, 
Holevo and Werner [^, may be a promising avenue for studying entropy and 
capacity in quantum information. Despite the results mentioned above, many 
open conjectures remain; see ^ ^, ^ for further details. 
A Epstein's proof of concavity of $A \mapsto \mathrm{Tr}\, e^{K + \log A}$

Let $f(x) = \mathrm{Tr}\, e^{K + \log(A + xB)}$ with $A > 0$ strictly positive and $K$, $B$ self-adjoint. For
sufficiently small $x$, the function $f(x)$ is well-defined and the concavity of $F(A)$
in Theorem |^ follows immediately if $f''(0) \leq 0$ for all choices of $B = B^*$.

Instead of dealing with $f$ directly, Epstein considered the function $g(x) =
x f(x^{-1})$ which is well-defined for $|x| > \mu = \|A^{-1}\|$ and can be analytically
continued to the upper half plane so that

$$g(z) = \mathrm{Tr}\, e^{K + \log(zA + B)}. \tag{62}$$

There are a number of equivalent (when meaningful) ways of defining functions of
matrices. For the purposes needed here it is natural to assume that the spectrum
$\sigma(A)$ of the operator $A$ is contained in the domain of an analytic function $F(z)$
and that

One can then use the spectral mapping theorem $\sigma[F(A)] \subset F[\sigma(A)]$ for an
appropriate sequence of functions to verify that

$$
\begin{aligned}
\Im z > 0 \;&\Rightarrow\; \Im\,\omega(zA + B) > 0 \\
&\Rightarrow\; \pi > \Im\,\omega[\log(zA + B)] > 0 \\
&\Rightarrow\; \pi > \Im\,\omega[K + \log(zA + B)] > 0 \\
&\Rightarrow\; \Im\, \mathrm{Tr}\, e^{K + \log(zA + B)} > 0
\end{aligned}
$$

where $\Im$ denotes the imaginary part of a complex number and $\omega$ is used to denote
an arbitrary element of the spectrum of the indicated operator. Thus, $g(z)$ maps
the upper half plane into the upper half plane. Functions with this property have
been studied extensively under various names, including "operator monotone",
"Herglotz" or "Pick". (See, for example, 0, [1^, ^]). It then follows that $g$ has
an integral representation of the form

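$$g(z) = a + bz + \int_{-\mu}^{\mu} \frac{1}{t - z}\, dm(t) \tag{64}$$

for some positive measure $m(t)$. This yields (via the change of variables s = t

$$f(x) = ax + b + \int_{-\mu}^{\mu} \frac{x^2}{tx - 1}\, dm(t) \tag{65}$$

Differentiation under the integral sign can then be used to establish that $f''(0) \leq 0$
as desired by observing $\frac{x^2}{tx - 1} = t^{-2}\,[(xt + 1) + (xt - 1)^{-1}]$. The second derivative of
the right side with respect to $x$ at $x = 0$ is $-2$, so that $f''(0) = -2\int dm(t) \leq 0$. QED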
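The concavity statement itself is easy to probe numerically. The following sketch is not Epstein's argument; it only checks midpoint concavity of $F(A) = \mathrm{Tr}\, e^{K + \log A}$ for one arbitrary choice of real symmetric 2x2 matrices. The matrices `k`, `a1`, `a2` and the helper names are illustrative assumptions.

```python
import math

def sym_fun(m, f):
    """Apply a scalar function f to a real symmetric 2x2 matrix via its spectrum."""
    (a, b), (_, d) = m
    mean, radius = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    if radius == 0.0:                      # multiple of the identity
        return [[f(mean), 0.0], [0.0, f(mean)]]
    hi, lo = f(mean + radius), f(mean - radius)
    s, t = 0.5 * (hi + lo), 0.5 * (hi - lo) / radius
    return [[s + t * (a - mean), t * b], [t * b, s + t * (d - mean)]]

def trace_exp(k, a):
    """F(A) = Tr exp(K + log A) for symmetric K and positive definite A."""
    log_a = sym_fun(a, math.log)
    total = [[k[r][c] + log_a[r][c] for c in range(2)] for r in range(2)]
    e = sym_fun(total, math.exp)
    return e[0][0] + e[1][1]

def midpoint_gap(k, a1, a2):
    """F((A1 + A2)/2) - (F(A1) + F(A2))/2, which is >= 0 when F is concave."""
    mid = [[0.5 * (a1[r][c] + a2[r][c]) for c in range(2)] for r in range(2)]
    return trace_exp(k, mid) - 0.5 * (trace_exp(k, a1) + trace_exp(k, a2))

k = [[0.3, -0.5], [-0.5, -0.2]]        # self-adjoint K that commutes with neither A
a1 = [[2.0, 0.5], [0.5, 1.0]]          # positive definite
a2 = [[0.5, -0.3], [-0.3, 3.0]]        # positive definite
zero = [[0.0, 0.0], [0.0, 0.0]]

gap = midpoint_gap(k, a1, a2)
gap_zero = midpoint_gap(zero, a1, a2)  # K = 0 gives F(A) = Tr A, which is linear
print(gap, gap_zero)
```

With $K = 0$ the gap vanishes up to rounding because $F$ reduces to the trace; a single test point does not prove anything, but a negative `gap` would signal an error in the setup.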